Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd: rewrite push/pull et al. #1602

Merged
merged 14 commits into from
Aug 18, 2020
Merged

cmd: rewrite push/pull et al. #1602

merged 14 commits into from
Aug 18, 2020

Conversation

jorgeorpinel
Copy link
Contributor

Continuation of #1384 with out-of-scope details. Waiting for that PR to close.

@shcheklein shcheklein temporarily deployed to dvc-landing-cmd-target--n7hdsb July 20, 2020 06:55 Inactive
@jorgeorpinel jorgeorpinel temporarily deployed to dvc-landing-cmd-target--n7hdsb July 20, 2020 06:57 Inactive
@jorgeorpinel jorgeorpinel temporarily deployed to dvc-landing-cmd-target--n7hdsb July 20, 2020 06:58 Inactive
@jorgeorpinel jorgeorpinel changed the title cmd: rerrite push/pull et al. cmd: rewrite push/pull et al. Jul 20, 2020
@jorgeorpinel jorgeorpinel marked this pull request as ready for review August 8, 2020 18:37
@jorgeorpinel jorgeorpinel changed the base branch from fix-886 to master August 8, 2020 18:38
@jorgeorpinel jorgeorpinel requested a review from shcheklein August 8, 2020 18:39
@jorgeorpinel
Copy link
Contributor Author

I think this is ready to merge. Needs an approval though.

storage. It will not upload files associated with earlier commits in the
<abbr>repository</abbr> (if using Git), nor will it upload files that have not
changed.
- The push command checks the appropriate `dvc.lock` and `.dvc` files in the
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dvc.yaml?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a command-reference, but I don't think this information should be here (perhaps on Dvc files?).

Also, @jorgeorpinel, dvc.yaml is a discovery file which dvc reads to find outputs/dependencies, etc. and dvc.lock is a file to correlate those outputs/deps with the exact version through the checksums. dvc.lock is never used if there's no corresponding deps/outs/stages in dvc.yaml. So, either both dvc.yaml and dvc.lock are read together or we could just say dvc.yaml is read (and consider dvc.lock as an implementation detail).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a command-reference, but I don't think this information should be here

could you clarify please what information exactly do you mean?

So, either both dvc.yaml and dvc.lock are read together or we could just say dvc.yaml is read (and consider dvc.lock as an implementation detail) ..

we come back to a single term "DVC files" w/o specifying details with some tooltip that mentions that DVC files == dvc.yaml + dvc.lock + .dvc . All of those files play their role here. We can expand later and bit explicit if needed where it is needed.

Copy link
Contributor Author

@jorgeorpinel jorgeorpinel Aug 15, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dvc.yaml is mentioned just before: "targets ... limit what to pull. It accepts ... and stage names (found in dvc.yaml)". Then, when we say "push checks the appropriate dvc.lock files", by "appropriate" we mean corresponding to what was found in dvc.yaml. But let me see if I can clarify this further... ⌛

just say dvc.yaml is read (and consider dvc.lock as an implementation detail

This is not a bad idea... We do go over implementation details in cmd refs though, in fact they're probably the lowest-level docs we have. But we could def. hide some of those details in expandable sections.
I'm just not sure dvc.lock should be considered a low-level detail; It's pretty visible to the user.

we come back to a single term "DVC files" w/o specifying details with some tooltip that mentions that DVC files

Yes, let's try to merge this one as best we can though, and follow up on that in #1663 ?

See also iterative/dvc/issues/4393.

Copy link
Contributor Author

@jorgeorpinel jorgeorpinel Aug 15, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let me see if I can clarify this further... ⌛

I turn this whole bullet list into a more simple paragraph in dc87fe5. PTAL

with `git commit` and `git push`).
The `dvc push` command uploads data to
[remote storage](/doc/command-reference/remote). It doesn't save any changes to
the code, `dvc.yaml`, or `.dvc` files (that should be saved with `git commit`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should include dvc.lock here ...

Copy link
Contributor Author

@jorgeorpinel jorgeorpinel Aug 16, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True. That whole sentence is kind of a note though, now that I think about it. I changed it a little in 4717b78. It's still not perfect but we'll have to review this again soon, when addressing #1663.

`.dvc` files in the <abbr>workspace</abbr>. The command options will either
limit or expand the set of stages or `.dvc` files to consult.
Without arguments, it uploads all files and directories missing from remote
storage, found as <abbr>outputs</abbr> in the stages (in `dvc.lock`) or `.dvc`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dvc.yaml?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm yeah... Kinda. dvc.lock is where you find the output file names — that's what this meant. But probably this paragraph should be simplified and not even mention either one. See 246f446.

Copy link
Member

@shcheklein shcheklein left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks like dvc.yaml, dvc.lock, .dvc is still a problem in many places ... I really suggest to stop and find a simple solution, something similar to DVC files + expand where we explain exact mechanism (reads dvc.yaml to get tracked files and directories lists, then dvc.lock to get hashes ... etc)

@jorgeorpinel
Copy link
Contributor Author

jorgeorpinel commented Aug 16, 2020

stop and find a simple solution

I agree. But I don't think there's a simple solution if we want to be thorough in the explanations until iterative/dvc/issues/4278 or related discussions are figured out. What we can do now is avoid going into details (where stage names are listed, where hash values are, etc.) and just talk about stages and maybe .dvc files. Users will have to know that stages are in dvc.yaml and what dvc.lock does.

something similar to DVC files + expand where we explain exact mechanism

Unfortunately, even if I agreed on that term, I think that we can't go back to pre 1.x simplicity. This is because DVC 1.x is more complex, since it introduced dvc.yaml and lockfiles in addition to DVC-files.

But I agree that expandable sections can be our friends here.

My only question is can we merge this work so far and follow the above going fwd? (even have an immediate PR on these same docs) Or do you see still specific problems in these changes @shcheklein? Thanks

@shcheklein
Copy link
Member

My only question is can we merge this work so far and follow the above going fwd? (even have an immediate PR on these same docs) Or do you see still specific problems in these changes @shcheklein? Thanks

the problem we don't merge this is not the "DVC metafiles" issue, but some actual factual problems (e.g. dvc.lock -> dvc.yaml).


The default remote is used (see `dvc config core.remote`) unless the `--remote`
> Note that pulling data does not change any `dvc.yaml` or `.dvc` files, nor
> does it save any changes to the code, `dvc.lock`, or `.dvc` files (that should
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure why would I expect it to save changes in the first place?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True. I copied this from pull without editing accordingly. Updated.

I've also applied similar notes to fetch and commit, but didn't push it in this PR for now. I'll make another small PR soon.

Copy link
Member

@shcheklein shcheklein left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving to move forward. But i'm quite lost with this PR. For push and pull I'm not sure what have we solved here. And both do not loo very well structured to be honest.

@jorgeorpinel jorgeorpinel temporarily deployed to dvc-landing-cmd-target--t5ysxj August 18, 2020 04:59 Inactive
@jorgeorpinel
Copy link
Contributor Author

OK. Well, I'll merge this to move forward and come back to cleaning up all these refs.

One thing that improved was the correctness of DVC metafile mentions, although it will have to be updated again soon. Another thing that improved is the consistency among these as well as fetch, and to a lesser degree also commit and status.

@jorgeorpinel jorgeorpinel merged commit 9175069 into master Aug 18, 2020
jorgeorpinel added a commit that referenced this pull request Aug 18, 2020
@jorgeorpinel jorgeorpinel deleted the cmd/target-granularity branch August 20, 2020 22:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants